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Figure 1: The spectra of the first nine intrinsic 
model functions (IMFs) as a function of the 
logarithm of the frequency. These IMFs are 
extracted with the EMD from the Gaussian 
white noises with 1 million (220) points, and 
each IMF contains signals at comparable 
scales. Assume Tn to be the averaged period of 
the nth IMF, the figure suggests log2(T, n+1)- 
log2(Tn)=l, giving Tn+I/Tn=2. In other words, 
the averaged period of the (n+l)th IMF doubles 
the period of the nth IMF. Thus, the EMD 
behaves as a filter bank, i. e . , a collection of 
band-pass filters. Image Credit: Bo-Wen Shen 
(reproduction ofWu et al., 2004, in a different 
form). 

Abstract 

In this study, we discuss the performance of the parallel ensemble empirical mode decomposition 
(EMD) in the analysis of tropical waves that are associated with tropical cyclone (TC) formation. To 
efficiently analyze high-resolution, global, multiple-dimensional data sets, we first implement multi- 
level parallelism into the ensemble EMD (EEMD) and obtain a parallel speedup of 720 using 200 
eight-core processors. We then apply the parallel EEMD (PEEMD) to extract the intrinsic mode 
functions (IMFs) from preselected data sets that represent (1) idealized tropical waves and (2) large- 
scale environmental flows associated with Hurricane Sandy (2012). Results indicate that the PEEMD 
is efficient and effective in revealing the major wave characteristics of the data, such as wavelengths 
and periods, by sifting out the dominant (wave) components. This approach has a potential for 
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hurricane climate study by examining the statistical relationship between tropical waves and TC 
formation. 

1 Introduction 

Recent studies using high-resolution model simulations and satellite data have shown a potential for 
improving our understanding of TC formation and thus extending the lead time of TC genesis 
prediction. In a series of papers with a high-resolution global model, Shen et al. (20 10a, b; 2012a) 
have conducted case studies to examine the relationship between TC formation and different kinds of 
tropical waves, including (i) TC Nargis (2008) and its association with an Equatorial Rossby (ER) 
wave; (ii) Hurricane Helene (2006) and its association with an intensifying African easterly wave; and 
(iii) Twin TCs (2002) and their association with a mixed Rossby-gravity (MRG) wave embedded in 
organized large-scale convective systems. Multiscale processes and their interactions that could lead 
to the formation of the aforementioned TCs are illustrated with advanced scientific 4-D (X-Y-Z-time) 
visualizations by Shen et al. (2013a). Based on these studies, it has been hypothesized that improved 
predictability of TC formation can be achieved by refining hierarchical scale interactions of a TC and 
its environmental flows, such as different types of tropical waves. 

To verify the above hypothesis comprehensively, it is crucial to perform multiscale analysis to 
identify the spatial and/or temporal distributions of tropical waves and their correlation with TC 
activities. The former is illustrated with the case of Hurricane Sandy in this study. The original EMD 
(Huang et al., 1998) or EEMD (Wu et al., 2009) was selected to perform the multiscale analysis as the 
involved processes are non- stationary and nonlinear. The EMD decomposes one set of observational 
data into the so-called intrinsic mode functions (IMFs). The EEMD was originally developed to 
overcome the scale (or mode) mixing problem that may appear in the original EMD. The EEMD deals 
with an ensemble of data sets, each of which contains the original observational data and finite 
amplitude white noise; one set of IMFs is obtained for each ensemble member with the EMD, and an 
ensemble average is applied to obtain the final IMFs. It has been shown that the decomposed mean 
IMFs using EEMD stay within the natural filter period windows, significantly reducing the chance of 
scale mixing while still preserving a filter bank property (e.g., Flandrin et al., 2003; We et al., 2004). 
Depending on the required level of accuracy in the decomposed IMFs, typical number of ensemble 
members used in EEMD is in the range of 200-400 or higher, tremendously increasing computational 
demands. Therefore, to reduce wall-time to solution, it becomes important to improve the 
performance of the EEMD algorithm. 

Recently, a hybrid multi-level parallelism has been implemented in the EEMD (Shen et al., 2012b; 
Cheung et al., 2013). In section 2.1, we briefly discuss the design of the parallelism in the parallel 
EEMD (PEEMD). In section 2.2, we first present the hierarchical filtering properties of the EMD, and 
then illustrate the performance of the PEEMD in the analysis of idealized tropical wave solutions. The 
latter quantitatively examines the differences between the specific IMF(s) and the corresponding 
analytical component(s) in the original data. In section 2.3, we apply the PEEMD to analyze the 
multiscale environmental flows that were associated with the formation and initial movement of 
Hurricane Sandy (2012). 

2. Parallel Ensemble EMD and its Applications 
2.1. Design of Parallelism in the PEEMD 

The EMD, first proposed by Huang et al. (1998) at NASA/GFSC in late 1990s, extracts a finite 
number of the so-called IMFs from the local extrema of the data. Considering the IMFs as basis 
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functions that represent the original data, the EMD is adaptive. In contrast, the Fourier transform and 
wavelet analysis as well are not adaptive because they are based on an a priori basis. Due to the 
complexity of real world data that reside at a wide range of time (or spatial) scales, it has been 
challenging for a basis, which is selected a priori, to capture the essential components that reflect the 
underlying physical processes in the data. As a result of the adaptive and local features, the EMD has 
the advantages of dealing with non-stationary and nonlinear data. 

The ensemble EMD (EEMD) method was developed to overcome the mode-mixing issues of the 
EMD (Wu et al., 2009). This method requires large computational resources that are linearly 
proportional to the number of ensemble trials. It becomes even more challenging to extend the one- 
dimensional EEMD to multidimensional EEMD for multiple fields at different vertical levels. Thus, a 
parallel version is proposed to reduce the time to obtain solutions. In our parallel EEMD (PEEMD), a 
three-level parallelism has been implemented with MPI tasks for the first two levels and with 
OpenMP threads in the third level. The first level parallelism decomposes a global (or limited-area) 
domain into M different sub-domains, running with M processes in parallel, while the second level 
decomposes the total number of ensemble trials into N groups, running with N processes in parallel. 
The third level parallelism is achieved with the fine-grain OpenMP inside each of the N processes in 
each of the M tasks. Major features of the newly-developed PEEMD include: (1) promising 
scalability that yields a speedup of up to 720 using 200 eight-core processors; (2) bit-by-bit 
consistency that assures the binary identical results with different CPU layouts; and (3) sustainability 
enabled by a well-designed interface, making it easy to plug in new versions of the EMD. More 
details in the parallel implementation are given by Cheung et al (2013). In the following sections, we 
discuss the performance of the PEEMD in the analysis of idealized wave solutions and environmental 
flows associated with Hurricane Sandy. 

2.2 Hierarchical Scale Decompositions of Idealized Tropical Wave Solutions 



Figure 2: Decompositions of 
idealized tropical wave solutions. 

(a, b, c) The zonal wind 
perturbations (U’), the zonal basic 
winds, and the total winds, 
respectively (reproduced from Shen 
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et al., 2012a). The first field is from 
an analytical mixed Rossby-gravity 
wave solution, while the second 
field is the solution of an idealized 
west wind belt (WWB). Their 
summation (U’+WWB) represents 
the total winds, (d, e) The sixth and 
ninth IMFs extracted from the total 
winds, sequentially, (f) The 
summation of the sixth and ninth 
IMFs (i.e., IMF6+IMF9). (g, h, i) 

The differences between IMFs and 
corresponding idealized 
components, which are (IMF6-U’), 

(IMF9-WWB) and (IMF6+IMF9 
-total winds), respectively. Note 
that the color scales in the bottom 
panels are different from those in 
the top and middle panels. The 
sixth IMF represents more than 90 
percent of the zonal wind 
perturbations (a,g), while the ninth 
IMF represents more than 99 
percent of the zonal basic winds 
(WWB) (b, h). Image Credit: Bo- 
Wen Shen. 

Following the discussions of Flandrin et al. (2003) and We et al. (2004), we first illustrate the unique 
feature of the EMD that behaves as a filter bank, i.e., a collection of band-pass filters. To achieve this, 
we apply the EMD to decompose Gaussian white noises with 1 million (2 ) points into 9 IMFs, and 
perform Fourier analysis to obtain the spectrum for each of the IMFs (Figure 1). The spectrum of each 
IMF displays a Gaussian distribution, containing signals within a range of frequencies. In general, a 
higher order IMF has lower frequencies. Assume fn (Tn) to be the averaged frequency (period) of the 
nth IMF, Figure 1 suggests an approximate relation: log2(fn)-log2(fn+l)=l, where n=2-8. This 
relation is equivalent to the following form: log2(Tn+l)-log2(Tn)=l, which gives Tn+1/Tn=2. In 
other words, the averaged period of the (n+l)th IMF doubles the period of nth IMF. This feature is 
similar to that of a dyadic filter. Numerical experiments suggest that such a feature can be clearly 
shown with sufficient data points. However, a typical set of observation or model data may not have 
so many points to clearly display the feature of the dyadic filter (with doubling periods). Nonetheless, 
without loss of generality and accuracy, it can be said that the EMD performs a special kind of 
hierarchical filtering. This feature is ideal for multiscale analysis. 

To verify the performance of the PEEMD in decomposing meteorological data into different wave 
modes, we begin with analyzing idealized wave solutions, which consist of an analytical solution of a 
mixed Rossby-gravity (MRG) wave as wind perturbations (U’), and an analytical form of a west wind 
belt (WWB) as basic winds. The summation of the two fields produces the total winds, i.e., 

U’+WWB. These wind fields are shown in the top panels of Figure 2 (see also Shen et al., 2012a). 

The total winds are then decomposed into nine IMFs. Only the sixth IMF, the ninth IMF and their 
summation are displayed in the middle panels of Figure 2. It is clear that the sixth IMF (Figure 2d) 
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and ninth IMF (Figure 2e) resemble the MRG wave mode (Figure 2a) and WWB (Figure 2b), 
respectively. The deviations of a specific IMF from the original component (shown in the bottom 
panels) suggest that the sixth IMF can capture more than 90 percent of the original MRG wave mode, 
while the ninth IMF can represent more than 99 percent of the original WWB component. Encouraged 
by these promising decompositions, we use the PEEMD to analyze the environmental flows 
associated with Hurricane Sandy. 

2.3 Multiscale Processes associated with Hurricane Sandy 

In late October 2012, Storm Sandy made landfall near Brigantine, New Jersey, devastating 
surrounding areas and causing tremendous economic loss and hundreds of fatalities (e.g., Shen et al. 
2013b and references therein). An estimated damage of $50 billion made Sandy the second costliest 
TC in US history, surpassed only by Hurricane Katrina (2005). A central question to be addressed is 
to what extent the lead time of severe storm prediction such as Sandy can be extended (e.g., Emanuel 
2012; Kerr 2012). In the study of Shen et al., (2013b), we demonstrated our model’s capability to 
realistically predict Sandy’s genesis with a lead time of up to six days and subsequent evolution for 
the next two-day period of October 22-24. The multiscale processes that may have contributed to the 
extended lead time are analyzed with the PEEMD below. 

Figure 3 shows the ECMWF-Interim zonal winds (Dee et al., 201 1) averaged over a two-day period 
of 00Z October 21-23 (Figure 3a), and the selected IMFs extracted from the zonal winds. While the 
third IMF reveals a dominant wave mode (Figure 3b), the fifth IMF represents a WWB at low 
latitudes (Figure 3c). Note that the spatial distributions of the zoomed-in third IMF (Figure 3d) 
appears to have some similarities to those in Figure 2a. Figure 4 shows the time longitude diagram of 
the meridional winds at 200 hPa and the selected IMFs, which are used to determine the wavelentgh 
and phase speed of the wave-like disturbance. Among the IMFs, the fourth IMF (Figure 4c) displays a 
more visible wave mode than the orignal data (Figure 4a). This wave mode has an estimated 
wavelentgh of 45 degrees (-4535.1 km). Plug this wavelength into theoretical wave dispersion 
relations (e.g., Shen et al. 2013b and references therein), we obtain an estimated period of 4.914 and 
7.407 days for an MRG and ER wave, respectively Therefore, analyses in Figures 3-4 provide the 
insightful understanding of the following multiscale processes associated with the formation and 
initial evolution of Sandy: (1) evolution of the (low-level) WWB and its interaction with the easterly 
wave (e.g., Figure 3c); and (2) the location of an upper-level trough, which is indicated by the white 
areas between 85 and 80 degrees west longitude and appeared in association with the MRG and ER 
waves (Figure 4c). 
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Figure 3: Decompositions of the low-level zonal winds from the ERA-Interim 
reanalysis data, (a) 850-hPa zonal winds averaged over a two-day period of 00Z 
October 21-23, 2012 (reproduced from Shen et al., 2013b). (b, c) The third and 
fifth IMFs extracted from the zonal wind fields, respectively, (d) A zoomed-in view 
of IMF 3 in a green box. The black dot at 12oN, 78.5oW indicates the vortex 
center of the two-day averaged 850-hPa winds. Note that a wave-like component 
is shown in (b) and (d), while a west wind belt is displayed at low latitudes 
shaded in red in (c). These two components appear to have some similarities to 
those in Figure 2a and 2b, respectively. Image Credit: Bo-Wen Shen. 

3. Concluding Remarks 

To efficiently and effectively reveal multiscale processes from high-resolution, global, multi- 
dimensional Earth Science data, we have developed the parallel version of the ensemble empirical 
model decomposition (PEEMD) method with a three-level parallelism. In this study, we illustrated the 
performance of the PEEMD in extracting tropical wave components from two different data sets: (1) 
idealized tropical wave solutions and (2) large-scale environmental flows associated with Sandy. 
Compared to conventional methods, our analyses with the PEEMD provide better representations of 
the dominant modes (e.g., MRG wave and WWB) associated with Sandy’s formation. Spatial and 
temporal correlation between tropical waves and hurricane formation are still under investigation. Our 
ultimate goal is to integrate the PEEMD with other modules (e.g., schemes of spectral analysis, 
readers for different satellite data, and numerical models) into the multiscale analysis package for 
hurricane climate study. 
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Figure 4: Decompositions of the upper-level meridional winds from the ERA-Interim reanalysis 
data, (a) Time-longitude diagram of meridian winds at 200-hPa averaged over 20 and 30 degrees 
north latitude (reproduced from Shen et al., 2013b). (b, c, d) The third, fourth and fifth IMFs 
extracted from the 200-hPa meridional winds, respectively. The locations of Sandy at 00Z Oct 22- 
24 are shown in black multiplication signs. Panel c displays an estimated wavelength of 45 degrees. 
Theoretical dispersion relations with this wavelength predict a period of 4.5 days for a mixed 
Rossby-gravity wave and 7.5 days for an equatorial Rossby wave. Image Credit: Bo-Wen Shen. 
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